Communication using avatar

ABSTRACT

Generally this disclosure describes a video communication system that replaces actual live images of the participating users with animated avatars. A method may include selecting an avatar, initiating communication, capturing an image, detecting a face in the image, extracting features from the face, converting the facial features to avatar parameters, and transmitting at least one of the avatar selection or avatar parameters.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 15/184,409 titled “COMMUNICATION USING AVATAR” filed on Jun. 16, 2016, which is a continuation of U.S. patent application Ser. No. 13/993,612 titled “COMMUNICATION USING AVATAR” filed on Apr. 14, 2014, now U.S. Pat. No. 9,398,262, which is a National Stage Entry of PCT/CN2011/084902 filed Dec. 29, 2011, the entire disclosures of which are incorporated herein by reference.

FIELD

The following disclosure relates to video communication and interaction, and, more particularly, to methods for video communication and interaction using avatars.

BACKGROUND

The increasing variety of functionality available in mobile devices has spawned a desire for users to communicate via video in addition to simple calls. For example, users may initiate “video calls,” “videoconferencing,” etc., wherein a camera and microphone in a device transmit audio and real-time video of a user to one or more other recipients such as other mobile devices, desktop computers, videoconferencing systems, etc. The communication of real-time video may involve the transmission of substantial amounts of data (e.g., depending on the technology of the camera, the particular video codec employed to process the real-time image information, etc.). Given the bandwidth limitations of existing 2G/3G wireless technology, and the still limited availability of emerging 4G wireless technology, the proposition of many device users conducting concurrent video calls places a large burden on bandwidth in the existing wireless communication infrastructure, which may impact negatively on the quality of the video call.

BRIEF DESCRIPTION OF THE DRAWINGS

Features and advantages of various embodiments of the claimed subject matter will become apparent as the following Detailed Description proceeds, and upon reference to the Drawings, wherein like numerals designate like parts, and in which:

FIG. 1A illustrates an example device-to-device system in accordance with various embodiments of the present disclosure;

FIG. 1B illustrates an example virtual space system in accordance with various embodiments of the present disclosure;

FIG. 2 illustrates an example device in accordance with various embodiments of the present disclosure;

FIG. 3 illustrates an example system implementation in accordance with at least one embodiment of the present disclosure; and

FIG. 4 is a flowchart of example operations in accordance with at least one embodiment of the present disclosure.

Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications and variations thereof will be apparent to those skilled in the art.

DETAILED DESCRIPTION

Generally, this disclosure describes systems and methods for video communication and interaction using avatars. Using avatars, as opposed to live images, substantially reduces the amount of data to be transmitted, and thus, the avatar communication requires less bandwidth. In one embodiment an application is activated in a device coupled to a camera. The application may be configured to allow a user to select an avatar for display on a remote device, in a virtual space, etc. The device may then be configured to initiate communication with at least one other device, a virtual space, etc. For example, the communication may be established over a 2G, 3G or 4G cellular connection. Alternatively, the communication may be established over the Internet via a WiFi connection. After the communication is established, the camera may be configured to start capturing images. Facial detection/tracking is then performed on the captured images, and feature extraction is performed on the face. The detected face/head movements and/or changes in facial features are then converted into parameters usable for animating the avatar on the at least one other device, within the virtual space, etc. At least one of the avatar selection or avatar parameters is then transmitted. In one embodiment at least one of a remote avatar selection or remote avatar parameters is received. The remote avatar selection may cause the device to display an avatar, while the remote avatar parameters may cause the device to animate the displayed avatar. Audio communication accompanies the avatar animation via known methods.

FIG. 1A illustrates device-to-device system 100 consistent with various embodiments of the present disclosure. System 100 may generally include devices 102 and 112 communicating via network 122. Device 102 includes at least camera 104, microphone 106 and display 108. Device 112 includes at least camera 114, microphone 116 and display 118. Network 122 includes at least server 124.

Devices 102 and 112 may include various hardware platforms that are capable of wired and/or wireless communication. For example, devices 102 and 112 may include, but are not limited to, videoconferencing systems, desktop computers, laptop computers, tablet computers, smart phones (e.g., iPhones®, Android®-based phones, Blackberries®, Symbian®-based phones, Palm®-based phones, etc.), cellular handsets, etc. Cameras 104 and 114 include any device for capturing digital images representative of an environment that includes one or more persons, and may have adequate resolution for face analysis of the one or more persons in the environment as described herein. For example, cameras 104 and 114 may include still cameras (e.g., cameras configured to capture still photographs) or video cameras (e.g., cameras configured to capture moving images comprised of a plurality of frames). Cameras 104 and 114 may be configured to operate using light in the visible spectrum or with other portions of the electromagnetic spectrum including, but not limited to, the infrared spectrum, ultraviolet spectrum, etc. Cameras 104 and 114 may be incorporated within devices 102 and 112, respectively, or may be separate devices configured to communicate with devices 102 and 112 via wired or wireless communication. Specific examples of cameras 104 and 114 may include wired (e.g., Universal Serial Bus (USB), Ethernet, Firewire, etc.) or wireless (e.g., WiFi, Bluetooth, etc.) web cameras as may be associated with computers, video monitors, etc., mobile device cameras (e.g., cell phone or smart phone cameras integrated in, for example, the previously discussed example devices), integrated laptop computer cameras, integrated tablet computer cameras (e.g., iPad®, Galaxy Tab®, and the like), etc. Devices 102 and 112 may further comprise microphones 106 and 116.

Microphones 106 and 116 include any devices configured to sense sound. Microphones 106 and 116 may be integrated within devices 102 and 112, respectively, or may interact with the devices via wired or wireless communication such as described in the above examples regarding cameras 104 and 114. Displays 108 and 118 include any devices configured to display text, still images, moving images (e.g., video), user interfaces, graphics, etc. Displays 108 and 118 may be integrated within devices 102 and 112, respectively, or may interact with the devices via wired or wireless communication such as described in the above examples regarding cameras 104 and 114. In one embodiment, displays 108 and 118 are configured to display avatars 110 and 120, respectively. As referenced herein, an avatar is defined as a graphical representation of a user in either two dimensions (2D) or three dimensions (3D). Avatars do not have to resemble the looks of the user, and thus, while avatars can be lifelike representations, they can also take the form of drawings, cartoons, sketches, etc. In system 100, device 102 may display avatar 110 representing the user of device 112 (e.g., a remote user), and likewise, device 112 may display avatar 120 representing the user of device 102. In this way users may see a representation of other users without having to exchange the large amounts of information involved with device-to-device communication employing live images.

Network 122 may include various second generation (2G), third generation (3G), fourth generation (4G) cellular-based data communication technologies, Wi-Fi wireless data communication technology, etc. Network 122 includes at least one server 124 configured to establish and maintain communication connections when using these technologies. For example, server 124 may be configured to support Internet-related communication protocols like Session Initiation Protocol (SIP) for creating, modifying and terminating two-party (unicast) and multi-party (multicast) sessions, Interactive Connectivity Establishment Protocol (ICE) for presenting a framework that allows protocols to be built on top of bytestream connections, Session Traversal Utilities for Network Address Translators, or NAT, Protocol (STUN) for allowing applications operating through a NAT to discover the presence of other NATs, IP addresses and ports allocated for an application's User Datagram Protocol (UDP) connection to connect to remote hosts, Traversal Using Relays around NAT (TURN) for allowing elements behind a NAT or firewall to receive data over Transmission Control Protocol (TCP) or UDP connections, etc.

FIG. 1B illustrates virtual space system 126 consistent with various embodiments of the present disclosure. System 126 may employ device 102, device 112 and server 124. Device 102, device 112 and server 124 may continue to communicate in a manner similar to that illustrated in FIG. 1A, but user interaction may take place in virtual space 128 instead of in a device-to-device format. As referenced herein, a virtual space may be defined as a digital simulation of a physical location. For example, virtual space 128 may resemble an outdoor location like a city, road, sidewalk, field, forest, island, etc., or an inside location like an office, house, school, mall, store, etc. Users, represented by avatars, may appear to interact in virtual space 128 as in the real world. Virtual space 128 may exist on one or more servers coupled to the Internet, and may be maintained by a third party. Examples of virtual spaces include virtual offices, virtual meeting rooms, virtual worlds like Second Life®, massively multiplayer online role-playing games (MMORPGs) like World of Warcraft®, massively multiplayer online real-life games (MMORLGs) like The Sims Online®, etc. In system 126, virtual space 128 may contain a plurality of avatars corresponding to different users. Instead of displaying avatars, displays 108 and 118 may display encapsulated (e.g., smaller) versions of virtual space (VS) 128. For example, display 108 may display a perspective view of what the avatar corresponding to the user of device 102 “sees” in virtual space 128. Similarly, display 118 may display a perspective view of what the avatar corresponding to the user of device 112 “sees” in virtual space 128. Examples of what avatars might see in virtual space 128 include, but are not limited to, virtual structures (e.g., buildings), virtual vehicles, virtual objects, virtual animals, other avatars, etc.

FIG. 2 illustrates an example device 102 in accordance with various embodiments of the present disclosure. While only device 102 is described, device 112 (e.g., remote device) may include resources configured to provide the same or similar functions. As previously discussed, device 102 is shown including camera 104, microphone 106 and display 108. Camera 104 and microphone 106 may provide input to camera and audio framework module 200. Camera and audio framework module 200 may include custom, proprietary, known and/or after-developed audio and video processing code (or instruction sets) that are generally well-defined and operable to control at least camera 104 and microphone 106. For example, camera and audio framework module 200 may cause camera 104 and microphone 106 to record images and/or sounds, may process images and/or sounds, may cause images and/or sounds to be reproduced, etc. Camera and audio framework module 200 may vary depending on device 102, and more particularly, the operating system (OS) running in device 102. Example operating systems include iOS®, Android®, Blackberry® OS, Symbian®, Palm® OS, etc. Speaker 202 may receive audio information from camera and audio framework module 200 and may be configured to reproduce local sounds (e.g., to provide audio feedback of the user's voice) and remote sounds (e.g., the sound of the other parties engaged in a telephone call, video call or interaction in a virtual place).

Facial detection and tracking module 204 may be configured to identify and track a head, face and/or facial region within image(s) provided by camera 104. For example, facial detection and tracking module 204 may include custom, proprietary, known and/or after-developed face detection code (or instruction sets), hardware, and/or firmware that are generally well-defined and operable to receive a standard format image (e.g., but not limited to, an RGB color image) and identify, at least to a certain extent, a face in the image. Facial detection and tracking module 204 may also be configured to track the detected face through a series of images (e.g., video frames at 24 frames per second) and to determine a head position based on the detected face. Known tracking systems that may be employed by facial detection and tracking module 204 may include particle filtering, mean shift, Kalman filtering, etc., each of which may utilize edge analysis, sum-of-square-difference analysis, feature point analysis, histogram analysis, skin tone analysis, etc.
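As an illustration of the Kalman filtering option named above, the following is a minimal sketch of a scalar Kalman filter that smooths one coordinate of a detected face center across frames. It is not the disclosed implementation: the class name, the random-walk motion model and the noise variances `q` and `r` are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Scalar Kalman filter with a random-walk motion model (assumed)."""
    x: float            # current estimate, e.g., face-center x in pixels
    p: float = 1.0      # variance of the estimate
    q: float = 1e-3     # process noise variance (hypothetical value)
    r: float = 1e-1     # measurement noise variance (hypothetical value)

    def update(self, z: float) -> float:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        self.p += self.q
        # Correct: blend the prediction with the new measurement z.
        k = self.p / (self.p + self.r)   # Kalman gain, always in (0, 1)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


# Usage: feed the per-frame detected coordinate; read back the smoothed one.
tracker = ScalarKalman(x=0.0)
smoothed = [tracker.update(10.0) for _ in range(50)]
```

A face center in the image plane could be tracked with two such filters, one per axis. A constant-velocity model would follow fast head motion more closely, at the cost of a larger state.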

Feature extraction module 206 may be configured to recognize features (e.g., the location and/or shape of facial landmarks such as eyes, eyebrows, nose, mouth, etc.) in the face detected by facial detection and tracking module 204. In one embodiment, avatar animation may be based directly on sensed facial actions (e.g., changes in facial features) without facial expression recognition. The corresponding feature points on an avatar's face may follow or mimic the movements of the real person's face, which is known as “expression clone” or “performance-driven facial animation.” Feature extraction module 206 may include custom, proprietary, known and/or after-developed facial characteristics recognition code (or instruction sets) that are generally well-defined and operable to receive a standard format image (e.g., but not limited to, an RGB color image) from camera 104 and to extract, at least to a certain extent, one or more facial characteristics in the image. Such known facial characteristics systems include, but are not limited to, the CSU Face Identification Evaluation System by Colorado State University.

Feature extraction module 206 may also be configured to recognize an expression associated with the detected features (e.g., identifying whether a previously detected face is happy, sad, smiling, frowning, surprised, excited, etc.). Thus, feature extraction module 206 may further include custom, proprietary, known and/or after-developed facial expression detection and/or identification code (or instruction sets) that is generally well-defined and operable to detect and/or identify expressions in a face. For example, feature extraction module 206 may determine size and/or position of the facial features (e.g., eyes, mouth, cheeks, teeth, etc.) and may compare these facial features to a facial feature database which includes a plurality of sample facial features with corresponding facial feature classifications (e.g., smiling, frown, excited, sad, etc.).
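The comparison against a facial feature database can be sketched as a nearest-neighbor lookup. The example below is illustrative only: the three-value feature vector (mouth width, mouth opening, brow raise, each normalized to face size), the sample values and the function name are hypothetical, and the disclosure does not prescribe a particular matching technique.

```python
import math

# Hypothetical database of sample facial features with classifications.
# Each vector is (mouth_width, mouth_open, brow_raise); values are invented.
FEATURE_DB = [
    ((0.55, 0.10, 0.00), "smiling"),
    ((0.35, 0.05, -0.20), "frown"),
    ((0.40, 0.45, 0.30), "surprised"),
    ((0.38, 0.02, -0.10), "sad"),
]


def classify_expression(features, database=FEATURE_DB):
    """Return the classification of the closest sample (Euclidean distance)."""
    _, label = min(database, key=lambda entry: math.dist(entry[0], features))
    return label


# Usage: a wide, slightly open mouth lands nearest the "smiling" sample.
label = classify_expression((0.54, 0.12, 0.02))
```

A practical classifier would likely use many more samples per class and a learned model; the lookup above only shows the compare-to-database idea.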

Avatar selection module 208 is configured to allow a user of device 102 to select an avatar for display on a remote device. Avatar selection module 208 may include custom, proprietary, known and/or after-developed user interface construction code (or instruction sets) that are generally well-defined and operable to present different avatars to a user so that the user may select one of the avatars. In one embodiment one or more avatars may be predefined in device 102. Predefined avatars allow all devices to have the same avatars, and during interaction only the selection of an avatar (e.g., the identification of a predefined avatar) needs to be communicated to a remote device or virtual space, which reduces the amount of information that needs to be exchanged. Avatars are selected prior to establishing communication, but may also be changed during the course of an active communication. Thus, it may be possible to send or receive an avatar selection at any point during the communication, and for the receiving device to change the displayed avatar in accordance with the received avatar selection.

Avatar control module 210 is configured to generate parameters for animating an avatar. Animation, as referred to herein, may be defined as altering the appearance of an image/model. A single animation may alter the appearance of a 2-D still image, or multiple animations may occur in sequence to simulate motion in the image (e.g., head turn, nodding, blinking, talking, frowning, smiling, laughing, winking, etc.). An example of animation for 3-D models includes deforming a 3-D wireframe model, applying a texture mapping, and re-computing the model vertex normals for rendering. A change in position of the detected face and/or extracted facial features may be converted into parameters that cause the avatar's features to resemble the features of the user's face. In one embodiment the general expression of the detected face may be converted into one or more parameters that cause the avatar to exhibit the same expression. The expression of the avatar may also be exaggerated to emphasize the expression. Knowledge of the selected avatar may not be necessary when avatar parameters may be applied generally to all of the predefined avatars. However, in one embodiment avatar parameters may be specific to the selected avatar, and thus, may be altered if another avatar is selected. For example, human avatars may require different parameter settings (e.g., different avatar features may be altered) to demonstrate emotions like happy, sad, angry, surprised, etc. than animal avatars, cartoon avatars, etc. Avatar control module 210 may include custom, proprietary, known and/or after-developed graphics processing code (or instruction sets) that are generally well-defined and operable to generate parameters for animating the avatar selected by avatar selection module 208 based on the face/head position detected by facial detection and tracking module 204 and/or the facial features detected by feature extraction module 206. For facial feature-based animation methods, 2-D avatar animation may be done with, for example, image warping or image morphing, whereas 3-D avatar animation may be done with free form deformation (FFD) or by utilizing the animation structure defined in a 3-D model of a head. Oddcast is an example of a software resource usable for 2-D avatar animation, while FaceGen is an example of a software resource usable for 3-D avatar animation.
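The conversion of facial features into avatar parameters, including optional exaggeration, might look like the following sketch. The landmark keys, the parameter names (`jaw_open`, `mouth_stretch`, `head_roll_deg`) and the normalization by inter-eye distance are assumptions made for illustration; the disclosure does not define a specific parameter set.

```python
import math


def clamp(value, low=0.0, high=1.0):
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def to_avatar_parameters(landmarks, exaggeration=1.0):
    """Map 2-D landmark positions (pixels) to hypothetical avatar parameters.

    landmarks: dict with (x, y) tuples under the assumed keys used below.
    exaggeration: factor > 1.0 emphasizes the expression.
    """
    left_eye, right_eye = landmarks["left_eye"], landmarks["right_eye"]
    # Normalize by inter-eye distance so results do not depend on face size.
    eye_dist = math.dist(left_eye, right_eye)
    mouth_open = math.dist(landmarks["upper_lip"], landmarks["lower_lip"]) / eye_dist
    mouth_width = math.dist(landmarks["mouth_left"], landmarks["mouth_right"]) / eye_dist
    # In-plane head rotation from the slope of the line joining the eyes.
    roll = math.degrees(math.atan2(right_eye[1] - left_eye[1],
                                   right_eye[0] - left_eye[0]))
    return {
        "jaw_open": clamp(mouth_open * exaggeration),
        "mouth_stretch": clamp(mouth_width * exaggeration),
        "head_roll_deg": roll,
    }


# Usage with invented landmark coordinates for a level, slightly open mouth.
example = {
    "left_eye": (100, 100), "right_eye": (200, 100),
    "upper_lip": (150, 160), "lower_lip": (150, 180),
    "mouth_left": (120, 170), "mouth_right": (180, 170),
}
params = to_avatar_parameters(example, exaggeration=2.0)
```

Because the outputs are normalized and clamped, the same values could in principle be applied to any predefined avatar; an avatar-specific mapping would replace the final dictionary with per-avatar settings.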

In addition, in system 100 avatar control module 210 may receive a remote avatar selection and remote avatar parameters usable for displaying and animating an avatar corresponding to a user at a remote device. Avatar control module 210 may cause display module 212 to display avatar 110 on display 108. Display module 212 may include custom, proprietary, known and/or after-developed graphics processing code (or instruction sets) that are generally well-defined and operable to display and animate an avatar on display 108 in accordance with the example device-to-device embodiment. For example, avatar control module 210 may receive a remote avatar selection and may interpret the remote avatar selection to correspond to a predetermined avatar. Display module 212 may then display avatar 110 on display 108. Moreover, remote avatar parameters received in avatar control module 210 may be interpreted, and commands may be provided to display module 212 to animate avatar 110. In one embodiment more than two users may engage in the video call. When more than two users are interacting in a video call, display 108 may be divided or segmented to allow more than one avatar corresponding to remote users to be displayed simultaneously. Alternatively, in system 126 avatar control module 210 may receive information causing display module 212 to display what the avatar corresponding to the user of device 102 is “seeing” in virtual space 128 (e.g., from the visual perspective of the avatar). For example, display 108 may display buildings, objects, animals represented in virtual space 128, other avatars, etc. In one embodiment avatar control module 210 may be configured to cause display module 212 to display “feedback” avatar 214. Feedback avatar 214 represents how the selected avatar appears on the remote device, in a virtual place, etc. In particular, feedback avatar 214 appears as the avatar selected by the user and may be animated using the same parameters generated by avatar control module 210. In this way the user may confirm what the remote user is seeing during their interaction.
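One way the display might be divided when several remote avatars are shown at once is a near-square grid. The sketch below is an assumption for illustration (the disclosure does not specify a layout); the function name and the (x, y, width, height) tile format are hypothetical.

```python
import math


def tile_display(width, height, count):
    """Split a width x height display into `count` equal tiles.

    Returns a list of (x, y, tile_width, tile_height) tuples in row-major
    order, one per remote avatar.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    cols = math.isqrt(count - 1) + 1   # smallest integer >= sqrt(count)
    rows = -(-count // cols)           # ceiling of count / cols
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(count)]


# Usage: four remote users on a 640x480 display give a 2x2 grid.
tiles = tile_display(640, 480, 4)
```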

Communication module 216 is configured to transmit and receive information for selecting avatars, displaying avatars, animating avatars, displaying virtual place perspective, etc. Communication module 216 may include custom, proprietary, known and/or after-developed communication processing code (or instruction sets) that are generally well-defined and operable to transmit avatar selections and avatar parameters and to receive remote avatar selections and remote avatar parameters. Communication module 216 may also transmit and receive audio information corresponding to avatar-based interactions. Communication module 216 may transmit and receive the above information via network 122 as previously described.
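To make the data-size argument concrete, the following sketch packs an avatar selection and a set of avatar parameters into compact binary messages using Python's standard `struct` module. The wire format (a one-byte message type, a 16-bit avatar identifier, 32-bit float parameters) is entirely hypothetical; the disclosure does not define a message layout.

```python
import struct
from typing import Sequence

# Hypothetical message type codes.
SELECTION = 1
PARAMETERS = 2


def encode_selection(avatar_id: int) -> bytes:
    """Type byte followed by a 16-bit identifier of a predefined avatar."""
    return struct.pack("!BH", SELECTION, avatar_id)


def encode_parameters(params: Sequence[float]) -> bytes:
    """Type byte, parameter count, then each parameter as a 32-bit float."""
    return struct.pack(f"!BB{len(params)}f", PARAMETERS, len(params), *params)


def decode(message: bytes):
    """Return ("selection", id) or ("parameters", [values])."""
    kind = message[0]
    if kind == SELECTION:
        (avatar_id,) = struct.unpack("!H", message[1:3])
        return ("selection", avatar_id)
    if kind == PARAMETERS:
        count = message[1]
        values = struct.unpack(f"!{count}f", message[2:2 + 4 * count])
        return ("parameters", list(values))
    raise ValueError(f"unknown message type {kind}")


# Usage: a selection is 3 bytes; three parameters are 14 bytes.
selection_msg = encode_selection(7)
parameters_msg = encode_parameters([0.5, -0.25, 1.0])
```

Under these assumptions, sending three parameters for each frame at the 24 frames per second mentioned earlier amounts to 14 × 24 = 336 bytes of payload per second before transport overhead, which illustrates why parameters are much lighter than live video.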

FIG. 3 illustrates an example system implementation in accordance with at least one embodiment. Device 102′ is configured to communicate wirelessly via WiFi connection 300 (e.g., at work), server 124′ is configured to negotiate a connection between devices 102′ and 112′ via Internet 302, and device 112′ is configured to communicate wirelessly via another WiFi connection 304 (e.g., at home). In one embodiment a device-to-device avatar-based video call application is activated in device 102′. Following avatar selection, the application may allow at least one remote device (e.g., device 112′) to be selected. The application may then cause device 102′ to initiate communication with device 112′. Communication may be initiated with device 102′ transmitting a connection establishment request to device 112′ via enterprise access point (AP) 306. Enterprise AP 306 may be an AP usable in a business setting, and thus, may support higher data throughput and more concurrent wireless clients than home AP 314. Enterprise AP 306 may receive the wireless signal from device 102′ and may proceed to transmit the connection establishment request through various business networks via gateway 308. The connection establishment request may then pass through firewall 310, which may be configured to control information flowing into and out of the WiFi network 300.

The connection establishment request of device 102′ may then be processed by server 124′. Server 124′ may be configured for registration of IP addresses, authentication of destination addresses and NAT traversals so that the connection establishment request may be directed to the correct destination on Internet 302. For example, server 124′ may resolve the intended destination (e.g., remote device 112′) from information in the connection establishment request received from device 102′, and may route the signal through the correct NATs and ports to the destination IP address accordingly. These operations may only have to be performed during connection establishment, depending on the network configuration. In some instances operations may be repeated during the video call in order to provide notification to the NAT to keep the connection alive. Media and Signal Path 312 may carry the video (e.g., avatar selection and/or avatar parameters) and audio information directly to home AP 314 after the connection has been established. Device 112′ may then receive the connection establishment request and may be configured to determine whether to accept the request. Determining whether to accept the request may include, for example, presenting a visual narrative to a user of device 112′ inquiring as to whether to accept the connection request from device 102′. Should the user of device 112′ accept the connection (e.g., accept the video call), the connection may be established. Cameras 104′ and 114′ may be configured to then start capturing images of the users of devices 102′ and 112′, respectively, for use in animating the avatars selected by each user. Microphones 106′ and 116′ may be configured to then start recording audio from each user. As information exchange commences between devices 102′ and 112′, displays 108′ and 118′ may display and animate avatars corresponding to the users of devices 102′ and 112′.

FIG. 4 is a flowchart of example operations in accordance with at least one embodiment. In operation 402 an application (e.g., an avatar-based video call application) may be activated in a device. Activation of the application may be followed by selection of an avatar. Selection of an avatar may include an interface being presented by the application, the interface allowing the user to select a predefined avatar. After avatar selection, communications may be configured in operation 404. Communication configuration includes the identification of at least one remote device or a virtual space for participation in the video call. For example, a user may select from a list of remote users/devices stored within the application, stored in association with another system in the device (e.g., a contacts list in a smart phone, cell phone, etc.), or stored remotely, such as on the Internet (e.g., in a social media website like Facebook, LinkedIn, Yahoo, Google+, MSN, etc.). Alternatively, the user may select to go online in a virtual space like Second Life.

In operation 406, communication may be initiated between the device and the at least one remote device or virtual space. For example, a connection establishment request may be transmitted to the remote device or virtual space. For the sake of explanation herein, it is assumed that the connection establishment request is accepted by the remote device or virtual space. A camera in the device may then begin capturing images in operation 408. The images may be still images or live video (e.g., multiple images captured in sequence). In operation 410 image analysis may occur starting with detection/tracking of a face/head in the image. The detected face may then be analyzed in order to extract facial features (e.g., facial landmarks, facial expression, etc.). In operation 412 the detected face/head position and/or facial features are converted into avatar parameters. Avatar parameters are used to animate the selected avatar on the remote device or in the virtual space. In operation 414 at least one of the avatar selection or the avatar parameters may be transmitted.

Avatars may be displayed and animated in operation 416. In the instance of device-to-device communication (e.g., system 100), at least one of remote avatar selection or remote avatar parameters may be received from the remote device. An avatar corresponding to the remote user may then be displayed based on the received remote avatar selection, and may be animated based on the received remote avatar parameters. In the instance of virtual place interaction (e.g., system 126), information may be received allowing the device to display what the avatar corresponding to the device user is seeing. A determination may then be made in operation 418 as to whether the current communication is complete. If it is determined in operation 418 that the communication is not complete, operations 408-416 may repeat in order to continue to display and animate an avatar on the remote device based on the analysis of the user's face. Otherwise, in operation 420 the communication may be terminated. The video call application may also be terminated if, for example, no further video calls are to be made.
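The sending side of operations 408 through 418 can be sketched as a simple loop. This is an illustrative outline under stated assumptions, not the disclosed implementation: the function name and its callable arguments are hypothetical stand-ins for the modules of FIG. 2, the end of the frame sequence stands in for the completion check of operation 418, and skipping frames in which no face is found is a design choice made for the example.

```python
def run_call(frames, detect_face, extract_features, to_parameters, send, avatar_id):
    """Send one avatar selection, then avatar parameters for each usable frame.

    frames: iterable of captured images (operation 408).
    detect_face, extract_features, to_parameters: hypothetical callables
        standing in for the detection, extraction and conversion steps.
    send: callable that transmits a (kind, payload) message (operation 414).
    Returns the number of parameter messages sent.
    """
    send(("selection", avatar_id))            # avatar selection sent once
    sent = 0
    for frame in frames:                      # loop ends when frames run out
        face = detect_face(frame)             # operation 410: detect/track
        if face is None:
            continue                          # assumed: skip frames without a face
        features = extract_features(face)     # operation 410: feature extraction
        params = to_parameters(features)      # operation 412: conversion
        send(("parameters", params))          # operation 414: transmission
        sent += 1
    return sent


# Usage with trivial stand-ins: the second "frame" contains no face.
outbox = []
count = run_call(
    frames=[1, None, 3],
    detect_face=lambda frame: frame,
    extract_features=lambda face: {"value": face},
    to_parameters=lambda features: [features["value"] * 0.1],
    send=outbox.append,
    avatar_id=5,
)
```

Receiving and displaying the remote avatar (operation 416) would run alongside this loop; it is omitted here to keep the sketch short.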

While FIG. 4 illustrates various operations according to an embodiment, it is to be understood that not all of the operations depicted in FIG. 4 are necessary for other embodiments. Indeed, it is fully contemplated herein that in other embodiments of the present disclosure, the operations depicted in FIG. 4 and/or other operations described herein may be combined in a manner not specifically shown in any of the drawings, but still fully consistent with the present disclosure. Thus, claims directed to features and/or operations that are not exactly shown in one drawing are deemed within the scope and content of the present disclosure.

As used in any embodiment herein, the term “module” may refer to software, firmware and/or circuitry configured to perform any of the aforementioned operations. Software may be embodied as a software package, code, instructions, instruction sets and/or data recorded on non-transitory computer readable storage medium. Firmware may be embodied as code, instructions or instruction sets and/or data that are hard-coded (e.g., nonvolatile) in memory devices. “Circuitry”, as used in any embodiment herein, may comprise, for example, singly or in any combination, hardwired circuitry, programmable circuitry such as computer processors comprising one or more individual instruction processing cores, state machine circuitry, and/or firmware that stores instructions executed by programmable circuitry. The modules may, collectively or individually, be embodied as circuitry that forms part of a larger system, for example, an integrated circuit (IC), system on-chip (SoC), desktop computers, laptop computers, tablet computers, servers, smart phones, etc.

Any of the operations described herein may be implemented in a system that includes one or more storage mediums having stored thereon, individually or in combination, instructions that when executed by one or more processors perform the methods. Here, the processor may include, for example, a server CPU, a mobile device CPU, and/or other programmable circuitry. Also, it is intended that operations described herein may be distributed across a plurality of physical devices, such as processing structures at more than one physical location. The storage medium may include any type of tangible medium, for example, any type of disk including hard disks, floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic and static RAMs, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), flash memories, Solid State Disks (SSDs), magnetic or optical cards, or any type of media suitable for storing electronic instructions. Other embodiments may be implemented as software modules executed by a programmable control device. The storage medium may be non-transitory.

Thus, the present disclosure provides a method and system for conducting a video communication using avatars instead of live images. The use of avatars reduces the amount of information to exchange as compared to the sending of live images. An avatar is selected and then communication may be established. A camera in each device may capture images of the participants. The images may be analyzed to determine face position and facial features. The face position and/or facial features are then converted into avatar parameters, and at least one of the avatar selection or the avatar parameters is transmitted to display/animate the avatar.

According to one aspect there is provided a method. The method may include selecting an avatar, initiating communication, capturing an image, detecting a face in the image, extracting features from the face, converting the facial features to avatar parameters, and transmitting at least one of the avatar selection or avatar parameters.
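
The operations of this aspect may be sketched, for illustration only, as a short pipeline. Every name below (`FaceFeatures`, `AvatarParameters`, `detect_face`, `extract_features`, `to_avatar_parameters`, `build_message`), the dictionary standing in for a captured image, and the JSON message layout are hypothetical and are not taken from the disclosure. The detector is a stand-in and performs no actual image analysis.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FaceFeatures:
    # Hypothetical measurements extracted from a detected face.
    mouth_open: float
    eye_open: float
    head_yaw_deg: float


@dataclass
class AvatarParameters:
    # Hypothetical animation controls for the selected avatar.
    jaw: float
    eyelids: float
    yaw_deg: float


def detect_face(image: dict) -> Optional[dict]:
    """Stand-in detector: returns the face entry if the image has one."""
    return image.get("face")


def extract_features(face: dict) -> FaceFeatures:
    """Read the assumed measurements from the detected face."""
    return FaceFeatures(face["mouth_open"], face["eye_open"], face["head_yaw_deg"])


def clamp01(value: float) -> float:
    """Limit a value to the range 0.0 to 1.0."""
    return max(0.0, min(1.0, value))


def to_avatar_parameters(features: FaceFeatures) -> AvatarParameters:
    """Convert facial features to avatar parameters (assumed one-to-one mapping)."""
    return AvatarParameters(
        jaw=clamp01(features.mouth_open),
        eyelids=clamp01(features.eye_open),
        yaw_deg=features.head_yaw_deg,
    )


def build_message(avatar_id: str, image: dict) -> Optional[bytes]:
    """Return the bytes to transmit, or None when no face is detected."""
    face = detect_face(image)
    if face is None:
        return None
    params = to_avatar_parameters(extract_features(face))
    payload = {"avatar": avatar_id, "params": asdict(params)}
    return json.dumps(payload).encode("utf-8")


# Usage: a captured "image" with an out-of-range eye measurement.
captured = {"face": {"mouth_open": 0.4, "eye_open": 1.2, "head_yaw_deg": -5.0}}
message = build_message("avatar_1", captured)
print(message)
```

In this sketch the message carries both the avatar selection and the avatar parameters. An implementation could equally transmit only one of the two, consistent with "at least one of" in the aspect above.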

According to another aspect there is provided a system. The system may include a camera configured to capture images, a communication module configured to transmit and receive information, and one or more storage mediums. In addition, the one or more storage mediums have stored thereon, individually or in combination, instructions that when executed by one or more processors result in the following operations comprising selecting an avatar, initiating communication, capturing an image, detecting a face in the image, extracting features from the face, converting the facial features to avatar parameters, and transmitting at least one of the avatar selection or avatar parameters.

According to another aspect there is provided a system. The system may include one or more storage mediums having stored thereon, individually or in combination, instructions that when executed by one or more processors result in the following operations comprising selecting an avatar, initiating communication, capturing an image, detecting a face in the image, extracting features from the face, converting the facial features to avatar parameters, and transmitting at least one of the avatar selection or avatar parameters.

The terms and expressions which have been employed herein are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described (or portions thereof), and it is recognized that various modifications are possible within the scope of the claims. Accordingly, the claims are intended to cover all such equivalents.

What is claimed:
 1. One or more non-transitory computer-readable storage devices having instructions stored thereon that, when executed by at least one processor of a first computing device, result in operations comprising: enable selection of a first avatar for a video call between the first computing device and a second computing device; identify one or more facial features of a user of the first computing device for the video call; generate information for the video call, to be transmitted to the second computing device, to cause the first selected avatar to appear animated on a display of the second computing device; wherein the information is based on the identified one or more facial features of the user of the first computing device; cause display of the first selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the first selected avatar on the second computing device; enable selection of a second avatar for the video call; generate second information for the video call, to be transmitted to the second computing device, to cause the second selected avatar to appear animated on the display of the second computing device; wherein the second information is based on the identified one or more facial features of the user of the first computing device; and cause display of the second selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the second selected avatar on the second computing device.
 2. The one or more storage devices of claim 1, wherein the one or more facial features are to be identified from one or more video images of the user of the first computing device.
 3. The one or more storage devices of claim 1, wherein the instructions, when executed by the at least one processor of the first computing device, result in additional operations comprising: process audio information of the user of the first computing device to be transmitted to the second computing device.
 4. A first computing device to conduct a video call with a second computing device using avatars, the first computing device comprising: memory circuitry to store instructions and data; a display device to display an avatar; and processor circuitry to process one or more instructions to perform operations comprising: enable selection of a first avatar for the video call; identify one or more facial features of a user of the first computing device for the video call; generate information for the video call, to be transmitted to the second computing device, to cause the first selected avatar to appear animated on a display of the second computing device; wherein the information is based on the identified one or more facial features of the user of the first computing device; cause display of the first selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the first selected avatar on the second computing device; enable selection of a second avatar for the video call; generate second information for the video call, to be transmitted to the second computing device, to cause the second selected avatar to appear animated on the display of the second computing device; wherein the second information is based on the identified one or more facial features of the user of the first computing device; and cause display of the second selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the second selected avatar on the second computing device.
 5. The first computing device of claim 4, further comprising: a video camera device to capture one or more video images of the user of the first computing device; wherein the one or more facial features are to be identified from the one or more captured video images of the user of the first computing device.
 6. The first computing device of claim 4, further comprising an audio capture device to capture audio information of the user of the first computing device to be transmitted to the second computing device.
 7. A method of communicating using avatars, comprising: enabling, by a first computing device, selection of a first avatar for a video call between the first computing device and a second computing device; identifying, by the first computing device, one or more facial features of a user of the first computing device for the video call; generating, by the first computing device for the video call, information to be transmitted to the second computing device, to cause the first selected avatar to appear animated on a display of the second computing device; wherein the information is based on the identified one or more facial features of the user of the first computing device; displaying, by the first computing device, the first selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the first selected avatar on the second computing device; enabling, by the first computing device, selection of a second avatar for the video call; generating, by the first computing device for the video call, second information, to be transmitted to the second computing device, to cause the second selected avatar to appear animated on the display of the second computing device; wherein the information is based on the identified one or more facial features of the user of the first computing device; and displaying, by the first computing device for the video call, the second selected avatar on the first computing device to enable the user of the first computing device to observe an appearance of the second selected avatar on the second computing device.
 8. The method of claim 7, wherein the one or more facial features are to be identified from one or more video images of the user of the first computing device.
 9. The method of claim 7, further comprising: processing, by the first computing device, audio information of the user of the first computing device to be transmitted to the second computing device.
 10. A first computing device to conduct a video call with a second computing device using avatars, the first computing device comprising: an avatar selection module to enable selection of at least a first and a second avatar for the video call; a feature extraction module to identify one or more facial features of a user of the first computing device for the video call; an avatar control module to generate information for the video call, to be transmitted to the second computing device, to cause the first selected avatar to appear animated on a display of the second computing device; wherein the information is based on the identified one or more facial features of the user of the first computing device; and a display module to display the first selected avatar on the first computing device for the video call to enable the user of the first computing device to observe an appearance of the first selected avatar on the second computing device.
 11. The first computing device of claim 10, wherein the avatar control module is further to generate second information, to be transmitted to the second computing device, to cause the second selected avatar to appear animated on the display of the second computing device; wherein the second information is based on the identified one or more facial features of the user of the first computing device.
 12. The first computing device of claim 11, wherein the display module is further to display the second selected avatar on the first computing device to enable the user of the first computing device to observe an appearance of the second selected avatar on the second computing device.
 13. The first computing device of claim 10, further comprising a facial detection and tracking module to detect and track a face of the user of the first computing device.
 14. The first computing device of claim 10, further comprising an audio capture device to capture audio information of the user of the first computing device to be transmitted to the second computing device.
 15. The first computing device of claim 10, further comprising a video capture device to capture one or more video images of the user of the first computing device; wherein the one or more facial features are to be identified from the one or more captured video images of the user of the computing device.